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Abstract 

We  perform  a  comparison  of  mass  conservation  properties  of  the  continuous  (CG)  and  discontinuous  (DG) 
Galerkin  methods  on  non-conforming,  dynamically  adaptive  meshes  for  two  atmospheric  test  cases.  The 
two  methods  are  implemented  in  a  unified  way  which  allows  for  a  direct  comparison  of  the  non-conforming 
edge  treatment.  We  outline  the  implementation  details  of  the  non-conforming  direct  stiffness  summation 
algorithm  for  the  CG  method  and  show  that  the  mass  conservation  error  is  similar  to  the  DG  method.  Both 
methods  conserve  to  machine  precision,  regardless  of  the  presence  of  the  non-conforming  edges.  For  lower 
order  polynomials  the  CG  method  requires  additional  stabilization  to  run  for  very  long  simulation  times. 
We  addressed  this  issue  by  using  filters  and/or  additional  artificial  viscosity.  The  mathematical  proof  of 
mass  conservation  for  CG  with  non-conforming  meshes  is  presented  in  Appendix  B. 
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1.  Introduction 

Both  element-based  Galerkin  methods  and  adaptive  mesh  refinement  for  atmospheric  modeling  are  active 
fields  of  research.  In  [1]  we  provide  an  overview  of  the  literature  covering  both  topics,  with  particular 
attention  to  the  discontinuous  Galerkin  method.  In  this  paper  we  extend  that  work  by  comparing  the 
continuous  (CG)  and  discontinuous  Galerkin  (DG)  methods  with  non-conforming  adaptive  mesh  refinement. 
As  a  metric  for  the  comparison  we  chose  the  mass  conservation,  as  it  is  an  important  feature  for  many 
atmospheric  applications.  We  believe  this  is  a  good  metric  because,  for  smooth  solutions,  both  methods 
will  yield  similar  accuracy;  it  is  unclear  how  both  methods  will  compare  for  non-smooth  solutions  but  it  is 
expected  that  they  can  achieve  similarly  good  results,  especially  with  the  inclusion  of  stabilization  techniques 
for  both  methods.  However,  this  topic  is  beyond  the  scope  of  this  paper  but  is  something  to  address  in  the 
future.  Both  methods  are  implemented  in  a  unified  way,  such  that  they  use  the  same  data  structures  and 
routines  for  computations,  differing  only  in  the  inter-element  communication,  which  is  described  in  detail 
in  Sec.  2.2.  This  approach  allows  us  to  be  confident  that  the  differences  we  see  in  the  results  are  due  to  the 
different  methods  rather  than  their  implementations. 

There  is  a  vast  collection  of  work  in  the  literature  on  CG  with  geometrically  non-conforming  mesh 
refinement,  starting  with  the  early  work  of  [2,  3]  and  later  [4,  5].  Those  approaches  were  based  on  the  mortar 
element  method,  which  employs  integral  projection  to  ensure  a  weak  continuity  across  the  non-conforming 
edges.  In  this  paper  we  follow  the  approach  by  Fischer  et  al.  [6],  that  uses  an  interpolation  based  method 
for  reconciling  the  continuity  condition  on  non-conforming  edges.  The  two  methods  were  compared  in  [7]. 
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[8]  used  the  interpolation  approach  for  geophysical  simulations.  [9]  compared  the  interpolation  based  non- 
conforming  spectral  element  method  with  a  finite  volume  method  for  the  shallow  water  equations  on  the 
sphere.  All  these  papers  discuss  the  mathematical  principles  of  the  interpolation  method,  mentioning  that 
the  presented  mathematical  constructs  (i.e.  scatter  matrix)  are  not  built  directly  in  the  code  but  are  rather 
applied.  We  present  in  this  paper  one  possibility  for  the  implementation  of  the  interpolation  method  (Sec. 
2.4).  It  relies  on  the  concept  of  local  and  global  data  storage  and  can  be  further  optimized  for  parallel 
implementations  to  avoid  two  communication  steps.  We  hope  that  this  algorithm  will  simplify  the  process 
of  the  implementation  of  the  non-conforming  CG  method  for  interested  readers. 

The  main  focus  of  the  paper  is  on  the  mass  conservation  of  the  CG  and  DG  methods.  Excellent  mass 
conservation  properties  are  often  mentioned  as  a  big  advantage  of  the  DG  method  over  its  continuous 
counterpart.  Hughes  et  al.  [10]  refuted  that  statement  arguing  that  the  CG  method  is  both  globally  and 
locally  (with  an  additional  post-processing  step)  conservative.  In  [11]  they  compared  the  CG  and  DG 
methods  and  reached  the  conclusion  that  the  conservation  properties  of  both  methods  (when  using  exact 
integration)  are  the  same  (with  an  additional  post-processing  step  to  obtain  the  local  conservation  for 
CG).  On  the  other  hand,  [9]  briefly  reported  the  lack  of  conservation  in  the  non-conforming  CG  method, 
comparing  it  with  the  conservative  finite  volume  method.  Admittedly,  in  their  work  the  authors  considered 
the  CG  method  on  the  sphere  and  with  the  non-conforming  mesh  refinement,  which  leaves  room  for  arguing 
whether  or  not  those  additional  features  affect  the  conservation.  None  of  the  aforementioned  papers  report 
a  quantitative  measure  of  the  conservation  of  mass,  energy,  or  any  quantity  for  that  matter.  [12]  (with  a 
later  follow-up  in  [13])  addressed  the  conservation  issue  of  the  conforming  continuous  Galerkin  method  on 
the  cubed-sphere  grid  showing  the  mimetic  properties  of  the  HOMME-SE  model  and  extends  [10]  to  inexact 
integration  used  in  the  model.  We  address  the  question  how  non-conforming  adaptive  mesh  refinement 
affects  the  mass  conservation  of  both  the  CG  and  DG  methods.  In  our  study  we  focus  on  the  numerical 
results,  as  they  provide  us  with  information  regarding  not  only  the  method  itself,  but  also  the  quality  of  the 
implementation.  We  focus  on  two-dimensional  Euler  test  cases;  however,  exactly  the  same  methods  can  be 
applied  to  3D  geometries.  In  Appendix  B  we  also  present  a  mathematical  proof  of  conservation  of  the  direct 
stiffness  summation  (DSS)  operation,  which  characterizes  the  CG  method.  The  value  of  this  proof  is  that 
such  a  proof  had  not  been  shown  previously  for  the  case  of  CG  with  non-conforming  meshes. 

The  paper  is  organized  as  follows:  in  Sec.  2  we  outline  the  governing  equations  and  give  an  overview  of 
the  unified  CG/DG  method  implemented  in  the  Non-hydrostatic  Unified  Model  of  the  Atmosphere  (NUMA) 
[14,  15].  We  discuss  the  data  structures  and  routines  shared  by  both  methods  and  point  out  the  differences. 
We  provide  pseudocode  for  the  non-conforming  DSS  operation  for  CG.  In  Sec.  3  we  provide  the  results  of 
the  mass  conservation  study  on  two  atmospheric  test  cases:  the  density  current  and  rising  thermal  bubble. 
We  conclude  the  paper  in  Sec.  4. 


2.  Methods 


2.1.  Governing  equations 

We  use  the  conservative  form  of  the  Euler  equations  described  in  detail  in  [16,  1[.  Here  we  provide  the 
equation  set  for  completeness.  We  write  the  Euler  equations  as  follows 


^  +  V.(„u)  =  0, 
dpu  „  / 

— —  +  V  •  (pu  <g>  u  +  PI) 


dpd 

~dt 


+  V  •  (p0 u)  =  0, 


-pgK 


(i) 


where  p  is  the  density,  u  =  (it,  w)T  is  the  velocity  field,  9  is  the  potential  temperature,  I  is  the  rank-2 
identity  matrix,  k  =  (0, 1)T  is  the  directional  vector  that  points  along  the  2  direction,  g  is  the  gravitational 
acceleration  and  P  is  the  pressure  obtained  from  the  equation  of  state 


P  =  Po 


(2) 
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where  R  =  cp  —  cv  is  the  gas  constant,  Po  is  the  pressure  at  the  lower  boundary,  and  cv  and  cp  are  the 
specific  heats  at  constant  volume  and  pressure  respectively. 

To  stabilize  the  computations  we  add  a  diffusion  term  V  •  (/zpVu)  and  V  •  (ppV0)  to  the  right  hand 
side  of  the  momentum  and  potential  temperature  equations,  respectively.  The  artificial  viscosity  /i  is  varied 
among  the  test  cases.  Note  that  NUMA  is  equipped  to  use  either  the  local  adaptive  viscosity  described  in 
[17],  or  the  dynamic  sub-grid  scale  model  for  LES  (Dyn-SGS)  reported  in  [18].  However,  to  allow  others  to 
replicate  our  results,  in  this  work  we  use  a  constant  /r. 

For  simplicity  we  can  write  equation  set  (1)  in  vector  form 

^+V-F  =  S(q),  (3) 

where  q  is  the  solution  vector,  F(q)  is  the  flux  and  S'(q)  is  the  source  term. 

2.2.  Overview  of  the  CG  and  DG  methods 

The  cornerstone  of  our  comparison  is  the  fact  that  the  NUMA  code  incorporates  both  CG  and  DG 
methods  within  a  unified  framework  [19].  This  means  that  both  methods  use  the  same  data  structures  and 
routines  for  integration,  differentiation  etc.,  which  allows  for  a  direct  comparison  of  the  key  differences.  We 
described  the  DG  approach  in  detail  in  a  previous  publication  (see  [1]).  Here  we  want  to  highlight  the  CG 
method  and  the  details  of  its  implementation  in  NUMA. 

The  Galerkin  machinery  applied  to  Eq.  (3)  gives 

[  ne+  [  n-(V>iF*(q))dTe-  [  V^:-F(qe)dHe=  [  &S( qe)dHe,  (4) 

J  rc  JnB  Jnc 

where  Ue  is  the  standard  element,  Te  is  the  boundary  of  that  element  and  ifi  is  the  test  function.  We 
replaced  F  in  the  boundary  term  by  a  numerical  flux  F*.  In  NUMA  we  use  the  Rusanov  flux  [20].  Note 
that  all  the  terms  are  defined  either  on  an  element  or  on  its  edge  Te.  We  can  further  transform  this  to  a 
discrete  matrix  form  using  the  expansion  qe  qeN  =  V’feQ (where  Np  is  the  number  of  points  in  the 

element  -  for  two-dimensional  quadrilateral  elements  Np  =  ( N  +  l)2,  where  N  is  the  expansion  order)  and 
reorganize  to  keep  only  the  time  derivative  term  on  the  left-hand-side: 

Med|v  =  MeS(qeN)  +  ( De)TF(qeN )  -  (Ms>e)T  F*  (qN)  (5) 

where  Me  is  the  mass  matrix,  Ms’e  is  the  side  mass  matrix,  De  is  the  differentiation  matrix.  A  detailed 
definition  of  those  matrices  can  be  found  in  [1].  At  this  point  it  is  important  to  mention  that  the  mass 
matrix  can  be  constructed  such  that  it  is  diagonal  (by  using  inexact  integration,  see  [16]),  and  therefore 
trivial  to  invert. 

Up  until  now  we  have  made  no  distinction  between  the  CG  and  DG  method.  The  difference  between  the 
two  comes  from  the  fact  that  for  CG  we  require  that  the  solution  to  (3)  be  C° .  Note  that  (5)  is  defined  on  a 
single  (standard)  element.  Only  the  numerical  flux  term  provides  a  communication  between  the  neighboring 
elements,  hence  the  lack  of  e  superscript.  If,  however,  we  assume  that  the  solution  is  C°  across  the  element 
boundaries,  the  numerical  flux  term  will  be  canceled  by  opposite  neighboring  values  at  internal  edges  (on 
the  physical  boundaries,  this  term  will  remain).  To  ensure  the  continuity  and  communication  between 
the  elements  in  the  CG  method,  we  perform  an  operation  traditionally  called  direct  stiffness  summation 
(DSS),  which  involves  a  mapping  of  element-local  degrees  of  freedom  to  their  global  counterparts  followed 
by  summation  (see  [21]  for  details).  The  implementation  of  the  DSS  on  the  non-conforming  interfaces  will 
be  presented  in  Sec.  2.3  along  with  the  discussion  of  the  data  storage.  Additional  discussions  regarding  the 
DSS  operation  on  non-conforming  edges  can  be  found  in  [8,  6,  7].  In  [22]  the  authors  introduce  continuous 
global  basis  functions  for  non-conforming  CG,  which  intrinsically  maintain  continuity  in  general  situations 
in  arbitrary  dimensions. 

To  show  how  the  CG  and  DG  methods  coexist  in  the  NUMA  code,  let  us  assume  that  we  use  explicit 
time  integration.  In  that  case  the  terms  on  the  right-hand-side  of  (5)  will  be  evaluated  at  the  current  time 
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Algorithm  1  CG/DG  algorithm  to  create  RHS 

function  COMPUTERHS(q) 

for  e  =  1,  Ne  do 

>  For  all  elements  in  the  grid 

Re  4-  COMPUTE_VOLUME_INTEGRALS(qe) 

[>  both  methods  compute  volume  integrals 

end  for 

if  CG  then 

>  CG  method 

R  apply  dss(R) 

t>  applies  DSS 

else 

for  e  =  1,  Ne  do 

>  DG  method  for  all  elements 

Re  <-  Re  +  COMPUTE_FLUX_INTEGRALS(qe,  q"e) 

>  computes  flux  to  neighbors  ne 

for  i  =  1,  Np  do 

t>  goes  through  nodal  points  in  each  element 

Rt  4-  Rf  / Mf 

>  and  divides  by  mass 

end  for 

end  for 

end  if 

return  R 

end  function 

14  15 


(a)  ('■ 

Figure  1:  (a)  local  numbering  of  points  (DG  storage),  (b)  global  numbering  of  points  (global  storage) 


and  used  to  advance  qN  in  time.  Algorithm  1  shows  a  pseudocode  for  evaluating  the  right-hand-side  (RHS), 
including  inverting  the  mass  matrix  and  left-multiplying  by  (Afe)_1.  The  COMPUTE_VOLUME_lNTEGRALS 
routine  evaluates  the  first  two  terms  of  the  RHS  of  (5).  This  step  is  common  to  both  CG  and  DG  methods. 
If  the  CG  method  is  chosen,  the  DSS  is  applied  to  the  evaluated  RHS.  Otherwise  the  last  term  of  (5)  is 
computed  by  COMPUTE_FLUX_lNTEGRALS,  the  result  is  added  to  the  existing  RHS  and  the  sum  is  divided 
by  the  mass.  In  the  case  of  the  CG  method  the  division  by  mass  is  handled  within  the  APPLY  DSS  routine. 

Using  this  unified  formulation  of  the  CG/DG  method  we  can  ensure  that  the  only  difference  between  the 
solutions  can  come  from  the  different  handling  of  the  communication  between  elements.  In  the  DG  method 
this  is  done  by  computing  a  flux,  which  is  conservative  by  construction.  In  Appendix  B  we  present  the  proof 
of  conservation  of  the  DSS  operator  used  in  the  CG  method;  this  proof  is  valid  for  both  conforming  and 
non-conforming  meshes.  In  the  following  section  we  discuss  the  data  storage  we  use  in  the  code,  which  is 
closely  related  to  the  implementation  of  the  DSS  operation  for  the  non-conforming  element  edges. 

2.3.  Data  storage 

In  the  NUMA  code,  data  is  stored  in  the  so-called  DG  storage.  This  means  that  instead  of  storing 
variables  as  globally  numbered  vectors,  we  index  the  nodal  solution  by  the  element  number  and  the  element- 
local  point  number.  This  concept  is  graphically  presented  in  Fig.  1.  Panel  (a)  shows  two  elements  with 
nodes  numbered  in  a  local  DG  way,  while  panel  (b)  presents  global  numbering  traditionally  used  for  the  CG 
approach.  Clearly,  the  DG  storage  has  a  bigger  memory  footprint,  as  the  points  which  lie  at  the  interface 
are  numbered  separately  in  each  element.  It  is,  however,  the  only  possible  storage  for  a  DG  method,  as 
it  assumes  no  continuity  between  the  elements.  For  the  CG  method  the  overlapping  points  at  the  element 
edges  will  hold  the  same  value,  even  in  the  DG  storage  approach. 
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Fischer  et.  al.  [6]  explain  the  procedure  for  moving  the  data  between  global  (CG)  and  local  (DG)  storage. 
In  their  formulation  a  Boolean  connectivity  matrix  Q  maps  global  values  of  uGG  to  locally  stored  uDG .  QT 
sums  the  local  values  of  uDG  from  the  corresponding  nodes  and  stores  the  result  in  a  global  vector  vCG. 
Note  that  vCG  ^  uCG ,  instead  we  have  the  following  relations 

uDG  =  Q  uCG ,  vCG  =  Qtudg.  (6) 

While  we  do  not  need  to  use  the  Q  operation  in  the  DG  code,  it  will  be  essential  in  the  CG  part  for  the 
direct  stiffness  summation.  In  NUMA  this  operation  is  performed  in  three  steps  outlined  in  Algorithm  2. 
In  the  code,  the  matrix  Q  is  never  constructed.  Instead,  we  evaluate  its  action,  as  well  as  the  action  of 
its  transpose.  The  details  of  the  DSS  operation  for  the  non-conforming  interface  will  be  discussed  in  the 
following  section. 


Algorithm  2  Direct  stiffness  summation 

function  APPLY  DSS (uDG) 

VCG  qTuDG 

>  Gather  data  to  CG  storage 

for  j  =  1,  Ng  do 

>  go  over  all  global  points 

uGG  =  vGG/mGG 

>  divide  by  mCG  =  QT mPG ,  where  mDG 

end  for 

is  a  vector  of  diagonals  of  Me  for  all  elements  e 

UDG  Q  UCG 

>  scatter  back  to  DG  storage 

return  uDG 

end  function 

Since  the  matrix  Q  is  a  Boolean  map  between  the  local  and  global  node  numbering,  we  implement 
this  map  as  an  interconnectivity  array  ig  =  intma(e ,  ip) ,  where  intma  stands  for  the  INTerconnectivity 
MAtrix,  ig  is  a  global  node  number,  e  is  the  element  number  and  ip  is  a  local  point  number.  The  actions 
of  Q  and  QT  for  conforming  elements  are  detailed  in  Alg.  3  lines  1-6. 

2.f.  Non- conforming  edges 

Non-conforming  edges  arise  when  an  edge  is  shared  by  more  than  two  elements.  Fig.  2  shows  (a)  local 
and  (b)  global  numbering  of  nodes  in  such  a  situation  for  linear  elements.  In  panel  (b)  the  red  circle  (labeled 
"8")  marks  a  hanging  node.  The  value  of  the  solution  there  is  constrained  by  the  C°  continuity  requirement 
of  the  CG  method.  The  DSS  operation  satisfies  this  constraint  in  the  non-conforming  element  situation  by 
an  appropriate  construction  of  the  Q  matrix.  It  is  described  in  detail  in  [6].  Since  it  is  not  constructed  in 
the  code,  we  outline  here  the  implementation  of  its  action  in  the  non-conforming  case.  The  algorithm  for 
the  DSS  operation  is  the  same  for  both  conforming  and  non-conforming  cases. 


(a) 


(3,3) 


(3,4) 


(2,1) 


(2,2) 


(b) 


Figure  2:  Non-conforming  elements:  (a)  local  numbering  of  points  (DG  storage),  (b)  global  numbering  of  points 
(CG  storage).  In  global  numbering,  point  no.  8  is  a  so  called  hanging  node,  as  its  value  is  constrained  by  the  continuity 
condition  between  the  three  elements. 
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Algorithm  3  Scatter  and  gather  operations  Q  and  Q 1  in  a  non-conforming  case 

Qt  action  (gather): 

vCG  =  0 

1:  for  e  =  1,  Ne  do 
2:  for  ip  =  1,  Np  do 

3:  ig  =  intma(e,ip) 

4:  vCG  (ig)  =  vGG  (ig)  +  uDG  (e  ,  ip) 

5:  end  for 

6:  end  for 


7 

for  V  f  G  NCE  do 

7 

for  V  f  G  NCE  do  >  for  non-conforming  edges 

8 

p  4—  parent  (f) 

8 

p  4—  parent  (f) 

>  parent  element 

9 

cl  4—  child(f , 1) 

9 

cl 4—  child(f , 1) 

>  child  element  #1 

10 

c2  4—  child(f  ,  2) 

10 

c2  4—  child(f , 2) 

>  child  element  #2 

11 

for  i=l  ,ngl  do 

11 

for  i=l,ngl  do 

12 

ip=imap(f ,p,i) 

12 

ic=imap(f , cl ,  i) 

>  find  nodes  on  f 

13 

ig=intma(p , ip) 

13 

Ui(i)  4—  uDG(cl  ,ic 

)  >  get  values 

14 

v(i)  4—  uCG( ig) 

14 

m2  (i)  1—  uDG(  c2,ic 

)  from  edge  nodes 

15 

end  for 

15 

end  for 

16 

Ml  =  L\V 

16 

Mi+2  =  L^Ui  +  L2U2 

>  interpolate  &  sum 

17 

m2  =  L2v 

18 

ic=imap(f , cl ,  1) 

>  Correct  corner 

19 

ig=intma(ic) 

20 

«CG(ig)  =tCG(ig)  - 

uDG'(cl,ic) 

21 

ic=imap (f , c2 , ngl) 

>  Correct  corner  ^=2 

22 

ig=intma(ic) 

23 

vGG  (ig)  =vGG(ig)  - 

uDG (c2,ic) 

24:  for  i=l,ngl  do 

25:  ip=imap(f  ,p ,  i) 

26:  ig=intma(p, ip) 

27:  VC'G  (ig)  =  VCG(ig)  +t>i_|_2(i) 

28:  end  for 

29:  end  for 


24 

for  i=l 

,ngl  do 

25 

ic=imap(f ,  cl 

i) 

26 

uDG 

(cl.ic)  = 

Ml 

(i) 

27 

uDG 

(c2, ic)  = 

u2 

(i) 

28 

end  for 

29 

end  for 

Q  action  (scatter): 

1:  for  e=l  ,Ne  do 
2:  forip=l,Apdo 

3:  ig  =  intma(e,ip)  conforming 

4:  uDG  (e  ,ip)  =  uCG  (ig) 

5:  end  for 

6:  end  for 
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Algorithm  3  provides  pseudocode  for  the  routines  evaluating  the  action  of  Q  (gather  operation  -  left) 
and  Q 7  (scatter  operation  -  right).  The  first  six  lines  perform  a  typical  conforming  gather/scatter  operation 
using  the  intma  array.  In  the  non-conforming  case  we  need  to  introduce  adjustments  to  accommodate  the 
hanging  nodes.  In  line  7  the  index  f  loops  over  all  non-conforming  edges.  In  lines  8-10  we  identify  the  parent 
and  children  elements.  The  parent  element  is  the  large  element  sharing  the  edge,  while  the  children  are  the 
small  elements  on  the  other  side  of  this  edge.  The  naming  convention  comes  from  the  fact  that  the  parent 
element  will  ensure  continuity,  which  will  be  enforced  on  the  children  edges.  In  Fig.  2  the  parent  is  the 
element  I,  while  the  children  are  the  smaller  elements  II,  III.  Indices  p,  cl  and  c2  correspond  to  appropriate 
element  numbers  (I,  II  and  III  respectively). 

The  section  of  the  algorithm  in  lines  7-10  is  responsible  for  creating  a  vector  of  edge  values  for  both 
scatter  ( v )  and  gather  (111,1x2)  operations.  Throughout  the  algorithm  Ne  is  the  number  of  elements,  ngl 
is  the  number  of  points  along  an  element  edge  and  Np  is  the  number  of  points  inside  each  element  (for 
quadrilateral  elements  with  tensor-product  basis  functions  Np  =  ngl  x  ngl).  The  array  imap  finds  the 
element  local  number  of  the  ith  point  that  belongs  to  a  given  element  (p ,  cl ,  c2)  and  lies  on  the  edge  f .  In 
the  example  of  Fig.  2v  —  [uGG ,  uGG]  and  for  the  gather  operation  ux  =  [u^l)  ’  u(23)^  and  M2  =  I'u(3<i)>  u(33)]- 

In  lines  16  and  17  for  the  scatter  operation  we  perform  the  interpolation  of  the  vector  v  from  the  parent 
side  of  the  edge  to  two  children  vectors  ui,  1x2.  The  matrices  (Ti)ij  =  ipii^j1)  and  (Z^)*.,-  =  are  the 

interpolation  matrices  from  the  parent  side  of  the  edge  to  each  of  the  two  children  sides  respectively.  Their 
entries  are  the  values  of  the  ID  parent  edge  basis  functions  ipi  at  the  children  side  nodal  points  ^)1,c2.  In 
Fig.  2  example  £cl  =  [£(2,i), £(2,3)]  and  £c2  =  [£(3,1) >£(3,3)]  or>  using  global  node  numbering,  £cl  =  [£2,£8], 
£c2  =  [Cs,  £4]  •  The  £  coordinate  is  measured  in  the  parent  element  coordinate  system.  All  the  interpolations 
are  evaluated  on  the  standard  element  in  the  computational  space  £  €  [—1, 1],  therefore,  the  matrices  X12 
are  computed  only  once  and  used  for  all  elements. 

For  the  gather  operation,  in  line  16  we  use  the  transpose  of  the  interpolation  matrix  to  compute  the 
contributions  from  the  children  elements  to  the  parent  side  of  the  edge.  The  contributions  are  summed, 
but  before  they  can  be  applied  to  the  solution  gathered  in  the  CG  storage,  we  need  to  account  for  the 
fact  that  some  contributions  from  the  corner  nodes  were  already  added  in  the  conforming  step.  In  Fig.  2 
example,  data  from  points  (2, 1)  and  (3,  3)  are  gathered  to  global  points  2  and  4  respectively.  The  summed 
contributions  i>i+2  account  for  contributions  of  local  points  (2, 1),  (2, 3),  (3, 1),  (3,  3)  to  points  2  and  4  in 
global  storage.  We  avoid  double  counting  by  subtracting  and  uGG^  from  uGG  and  uGG  respectively, 

which  is  executed  in  lines  18-23  of  the  gather  operation.  There  are  other  possibilities  for  executing  these 
operations  but,  we  feel,  this  is  the  most  sensible  way  since  the  standard  conforming  DSS  operation  is  first 
used  with  the  correction  for  non-conformity. 

In  lines  24-29  of  the  scatter  operation  we  copy  the  interpolated  vectors  1x1  and  u 2  to  the  local  storage. 
For  the  gather  operation  the  summed  solution  tq+2  is  added  to  appropriate  locations  in  the  global  CG 
storage.  This  concludes  both  operations. 

In  the  non-conforming  case,  the  DSS  algorithm  outlined  in  Alg.  2  remains  unchanged.  The  only  difference 
lies  in  the  construction  of  the  Q  matrix.  In  the  example  shown  in  Fig.  2,  we  first  perform  the  gather  operation, 
to  account  for  the  contributions  of  nodes  (2, 1),  (2,  3),  (3, 1),  (3, 3)  to  global  nodes  2, 4.  Once  complete,  we 
divide  the  gathered  solution  by  the  gathered  mass  matrix.  Since  both  the  solution  and  the  mass  matrix  are 
gathered  in  the  same  way,  this  division  is  equivalent  to  a  weighted  average  of  multiple  contributions  from 
local  nodes  to  a  single  global  node.  The  weight  of  each  local  node  is  its  corresponding  mass  matrix  entry. 
The  final  DSS  step  is  to  scatter  the  data  back  to  the  local  storage  using  the  Q  matrix.  Since  the  scatter 
operation  is  essentially  an  interpolation,  this  procedure  ensures  that  the  solution  is  continuous  across  the 
non-conforming  interfaces. 

A  good  check  for  the  correct  implementation  of  the  Q  matrix  is  to  construct  it  explicitly  for  a  small  prob¬ 
lem  and  inspect  its  entries1.  For  comparison,  in  Appendix  A  we  provide  the  Q  matrix  for  both  conforming 
and  non-conforming  examples  presented  in  Fig.  1  and  Fig.  2.  See  also  [8]  for  a  similar  analysis. 


1  We  are  grateful  to  Lucas  Wilcox  for  suggesting  this  idea. 
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2.5.  Computation  of  the  mass 


One  of  the  objectives  of  this  paper  is  to  discuss  the  mass  conservation  properties  of  the  continuous  and 
discontinuous  Galerkin  methods  with  non-conforming  mesh  refinement.  We  measure  the  mass  conservation 
error  as  follows: 


M(t) 


|m(0)  —  m(t)  | 
rn(0) 


(7) 


where  m(t)  =  fn  p(t)  clfl  is  the  total  mass  of  the  system  at  time  t.  The  way  the  total  mass  m(t)  is  computed 
is  important.  As  explained  in  [1],  simple  sequential  summation  of  mass  contributions  from  all  the  nodal 
points  can  result  in  a  considerable  roundoff  error,  which  blurs  the  outcome  of  the  mass  conservation  study. 
In  order  to  control  the  roundoff  error  in  the  mass  computation,  we  use  a  pairwise  summation  algorithm 
outlined  in  [23] 2 . 


3.  Results 

3.1.  Test  cases 

In  this  study  we  use  two  standard  atmospheric  test  cases:  the  density  current  [24]  and  the  rising  thermal 
bubble  [16].  They  both  display  features  that  move  dynamically  across  the  entire  domain,  which  allows 
for  using  adaptive  mesh  refinement  (AMR)  effectively.  The  difference  between  the  cases  is  that  the  rising 
thermal  bubble  allows  for  using  larger  Courant  numbers3.  Following  the  findings  of  the  previous  paper 
([1])  we  use  an  IMEX  ARK3  (3rd  order  additive  Runge-Kutta  method)  time  integrator  for  its  robustness 
with  AMR  simulations.  For  both  test  cases,  we  use  artificial  viscosity  to  ensure  stability  of  the  simulations. 
We  run  both  tests  for  a  much  longer  time  than  usual  to  be  able  to  reliably  report  the  mass  conservation 
properties.  The  cases  are  identical,  apart  from  the  simulation  time,  to  the  ones  used  for  our  previous  study 
[  1] .  Here  we  outline  them  for  the  sake  of  completeness. 


3.1.1.  Case  1:  Density  current 

The  case  consists  of  a  bubble  of  cold  air  dropped  in  a  neutrally  stratified  atmosphere.  The  bubble 
eventually  hits  the  lower  boundary  of  the  domain  (no  flux  wall)  and  moves  horizontally  shedding  Kelvin- 
Helmholtz  rotors  4.  In  order  to  obtain  the  grid-converged  solution  we  apply  artificial  viscosity  p  =  75m2 /s. 
The  initial  condition  is  defined  in  terms  of  potential  temperature  perturbation 


where  9C 


— 15K,  r  = 


for  r  >  rc 

1  +  cos  (^))  for  r  <  rc  ’ 


(8) 


2 

and  rc  =  1.  The  domain  is  defined  as  (x,z)  £  [0,  25600]m  x 


[0,6400]m  and  the  center  of  the  bubble  is  at  ( xc,zc )  =  (0,3000)m  with  the  axes  of  the  bubble  defined  by 
( xr,zr )  =  (4000, 2000)m.  The  boundary  conditions  for  all  four  boundaries  are  no-flux  walls,  i.e.  u|p  =  0. 
The  velocity  field  is  initially  set  to  zero  everywhere.  In  order  to  obtain  credible  mass  conservation  data  we 
run  the  simulation  up  to  t  =  10000s,  which  is  over  10  times  longer  than  the  usual  final  time  for  this  test  case 
(typically  t  =  900s).  This  time  is  enough  for  the  bubble  to  traverse  the  length  of  the  domain  three  times. 


2We  are  grateful  to  Jeremy  Kozdon  for  this  suggestion. 

3  The  reason  for  this  is  due  to  the  disparity  of  the  size  of  the  acoustic  waves  compared  to  the  other  waves  in  the  system. 
Since  we  use  IMEX  (implicit-explicit)  methods,  we  can  use  time-steps  that  allow  us  to  step  completely  over  the  acoustic  waves. 


3.1.2.  Case  2:  Rising  thermal  bubble 

In  this  test  case  a  warm  bubble  rises  in  a  constant  potential  temperature  atmosphere  (the  reference 
potential  temperature  9  =  300 K,  9  =  9  +  9').  As  it  rises,  it  deforms  until  it  forms  a  mushroom  shape 
and  hits  the  top  wall  at  a  later  time.  Initially,  the  air  is  at  rest  and  in  hydrostatic  balance.  The  initial 
potential  temperature  perturbation  is  given  by  Eq.  (8)  with  9C  =  0.5K  and  rc  =  250m.  The  domain  has 
dimensions  (x,z)  £  [0,  lOOOjm  x[0, 1000]m  and  the  bubble  is  positioned  at  (xc,zc)  =  (500,350)m.  The 
boundary  conditions  for  all  sides  are  no-flux.  The  simulation  runs  until  t  =  5000s.  In  this  case  we  try  to  use 
as  little  artificial  viscosity  as  possible,  just  enough  to  stabilize  the  simulation.  We  provide  the  discussion  on 
the  artificial  viscosity  used  in  Sec.  3.2.2. 

3.2.  Mass  conservation 
3.2.1.  Case  1  -  density  current 

Figure  3  shows  the  time  evolution  of  the  mass  conservation  error  for  Case  1  on  different  mesh  configu¬ 
rations.  Fig.  3a  presents  the  results  for  a  uniform  mesh  of  128  x  32  elements  with  polynomial  orders  of  5 
and  7  (blue  and  red  markers,  respectively),  and  64  x  16  elements  with  polynomial  order  10  (green  markers). 
In  this  way  the  simulations  for  polynomial  order  5  and  10  (nop=5  and  nop=10)  have  the  same  effective 
resolution,  while  the  nop=7  simulations  have  slightly  increased  resolution.  For  each  polynomial  order,  the 
results  are  reported  using  both  CG  and  DG  (circular  and  triangular  markers,  respectively).  The  dashed 
black  line  shows  the  machine  precision  level  of  10”16. 

The  uniform  mesh  test  is  designed  to  be  a  benchmark  against  which  we  will  test  the  mass  conservation 
properties  of  non-conforming  simulations.  Figure  3a  shows  that  for  all  the  simulations  the  mass  error  is 
contained  within  10-15.  Moreover,  the  mass  conservation  error  is  virtually  identical  for  both  methods  for 
all  polynomial  orders. 

(a)  (6) 


(c) 


Figure  3:  Case  1:  Mass  conservation  as  a  function  of  simulation  time  for  (a)  uniform  mesh,  (b)  statically  refined 
mesh,  (c)  dynamically  adaptive  mesh.  Triangular  markers  denote  the  CG  results  and  circular  markers  represent  the 
DG  results.  Colors  represent  different  polynomial  orders. 
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The  static  mesh  test  is  designed  to  check  how  the  mass  is  conserved  when  the  cold  front  is  moving 
through  the  non-conforming  mesh  interfaces.  The  mesh  is  refined  near  the  vertical  domain  boundaries  and 
coarsened  in  the  middle  (Fig.  4).  The  intermediate  region  is  created  following  the  2:1  balancing  constraint 
described  in  [1].  The  results  are  reported  in  Fig.  3b  using  the  same  color  and  marker  convention  as  for  the 
uniform  test.  Similarly  as  in  the  uniform  simulations,  the  mass  conservation  error  remains  under  10~15  for 


Figure  4:  Static  mesh  for  the  density  current  case.  The  mesh  is  refined  near  the  vertical  edges  of  the  domain  to  a 
5th  level  mesh  (32  elements  in  the  vertical  direction,  which  corresponds  to  the  resolution  of  the  reference  simulation). 
The  middle  of  the  domain  is  kept  at  2nd  level  mesh  (coarsened  three  times  to  4  elements  in  the  vertical  dimension). 
The  results  for  10th  order  polynomials  are  obtained  on  the  mesh  one  level  coarser  for  both  refined  and  coarse  regions 
to  keep  the  effective  resolution  comparable  between  5th  and  10th  order.  The  color  contours  show  the  potential 
temperature  perturbation. 

the  majority  of  the  tests.  The  worst  exception  is  the  CG  nop=5  simulation,  where  the  mass  conservation 
error  is  approximately  a  level  of  magnitude  higher.  Apart  from  this  result,  other  polynomial  orders  show 
similar,  albeit  not  identical,  mass  conservation  properties  between  CG  and  DG. 

Fig.  3c  presents  the  results  of  dynamically  adaptive  simulations,  where  the  refined  mesh  tracks  the  areas 
where  the  absolute  value  of  potential  temperature  perturbation  exceeds  0.1K.  Comparing  to  the  static  mesh 
simulations,  here  the  mass  conservation  error  is  slightly  increased  but  still  bounded  in  time.  The  only 
simulation  with  significantly  different  behavior  is,  again,  the  CG  nop=5  simulation.  For  all  the  other  cases 
the  CG  and  DG  methods  yield  virtually  the  same  mass  errors. 

For  all  three  results  presented  in  Fig.  3  it  is  important  to  bear  in  mind  that  we  operate  at  error  levels  at 
or  near  machine  precision.  For  that  reason  it  is  difficult  to  make  definitive  statements  about  which  result 
is  better,  unless  the  difference  is  an  order  of  magnitude.  Even  though  great  care  was  taken  to  make  sure 
the  mass  conservation  is  computed  accurately,  we  cannot  guarantee  it  is  free  of  roundoff  error  (see  [1]).  All 
the  results  have  a  similar  level  of  mass  error,  therefore  we  can  conclude  that  at  this  level  of  precision  the 
AMR  simulations  have  similar  conservation  properties  as  the  uniform  simulations.  There  is  no  significant 
difference  between  the  CG  and  DG  methods,  even  though  the  treatment  of  the  non-conforming  interface  is 
rather  different.  The  only  exception  is  the  nop=5  non-conforming  simulation  (both  for  static  and  adaptive 
mesh),  which  shows  different  behavior  between  the  CG  and  DG  methods.  Since  this  behavior  was  consistent 
between  static  and  adaptive  mesh  simulations,  it  may  be  an  indication  that  lower  order  non-conforming 
meshes  with  CG  may  be  less  robust  with  regards  to  mass  conservation  than  their  DG  counterparts.  It  is 
worthwhile  to  mention  here  that  we  use  inexact  integration,  which  may  be  the  cause  of  this  issue.  [25]  shows 
that  in  the  conforming  mesh  case  one  should  not  go  below  nop=4  to  avoid  inexact  integration  error.  The 
presence  of  the  non-conforming  edges  may  amplify  this  effect  for  the  CG  method.  Another  explanation  of 
this  issue  may  be  that  even  though  the  nop=5  and  nop=10  simulations  have  the  same  effective  resolution, 
the  higher  order  method  in  fact  resolves  the  problem  better.  It  is  possible  that  the  nop=5  simulation  with 
mesh  refinement  has  marginal  or  too  low  resolution  at  certain  times  in  this  problem  and  this  manifests  itself 
in  the  mass  conservation  error  being  slightly  higher. 

For  adaptive  and  uniform  meshes  we  have  run  a  series  of  simulations  with  different  effective  resolutions 
to  check  if  the  mass  conservation  properties  are  affected  by  the  mesh  resolution.  In  Fig.  5  we  plot  the 
time-average  of  the  mass  conservation  error  against  the  effective  resolution  of  the  simulation  for  different 
polynomial  orders  (colors)  and  methods  (markers).  The  results  for  the  adaptive  mesh  show  slightly  higher 
mass  conservation  error.  In  both  cases  there  is  no  visible  effect  of  the  resolution  on  the  average  mass 
conservation  error. 
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(a) 


(&) 


Figure  5:  Case  1:  Mass  conservation  against  the  effective  resolution  for  (a)  uniform  mesh  and  (b)  adaptive  mesh. 
Triangular  markers  denote  the  CG  results  and  circular  markers  represent  the  DG  results.  Colors  represent  different 
polynomial  orders. 


3.2.2.  Case  2  -  rising  thermal  bubble 

The  density  current  test  case  discussed  in  Sec.  3.2.1  (Case  1)  is  characterized  by  very  high  viscosity, 
which  was  introduced  to  guarantee  convergence  (see  [24]).  In  order  to  investigate  the  mass  conservation 
errors  for  lower  artificial  viscosity  levels,  we  use  the  rising  thermal  bubble  test.  For  usual  simulation  times 
of  t  =  700s  and  effective  resolutions  in  the  range  of  single  meters,  the  value  of  viscosity  was  typically  set 
to  fj,  =  0.1m2 /s  (see  [16,  26]).  In  this  study,  unless  otherwise  stated,  we  chose  /i  =  0.5m2/s  to  stabilize  the 
adaptive  bubble  simulations  until  t  =  5000s. 

Figure  6a  presents  the  reference  uniform  simulation  results.  The  mesh  consists  of  32  x  32  elements  with 
two  different  polynomial  orders  (nop=5  blue  markers  and  nop=7  red  markers).  Both  polynomial  orders  show 
the  mass  conservation  error  limited  by  10-14.  The  results  are  identical  for  both  the  CG  and  DG  methods. 

For  the  adaptive  mesh  simulations,  in  addition  to  polynomial  orders  5  and  7  we  have  run  a  nop=10 
simulation  with  the  smallest  element  size  corresponding  to  the  elements  of  the  uniform  16  x  16  mesh.  This 
gives  the  same  effective  resolution  as  the  32  x  32  nop=5  simulation.  Figure  6b  shows  the  comparison  of 
the  adaptive  mesh  results  for  the  CG  and  DG  methods  (triangular  and  circular  markers,  respectively)  for 
different  polynomial  orders.  For  all  the  results  the  mass  conservation  error  is  below  10-14.  Unlike  in  the 
previous  test  case,  the  results  are  not  identical  between  the  DG  and  CG  methods  for  all  polynomial  orders4. 
For  CG  with  polynomial  order  5  (blue  triangles),  the  simulation  did  not  complete  successfully  (for  5000 
seconds)  for  the  artificial  viscosity  /i  =  0.5m2 /s. 

Bearing  in  mind  that  in  the  previous  test  case  (Case  1),  the  CG  simulations  with  nop=5  behaved 
differently  than  the  rest,  we  increased  the  viscosity  to  /. i  =  5.0—  (we  tried  a  range  of  viscosity  values  from 
0.1  to  5.0m2/s  and  only  the  highest  one  resulted  in  a  successfully  completed  simulation  at  5000  seconds  - 
note  that  all  the  tests  were  successful  at  the  usual  final  time  of  t  =  700s,  regardless  of  the  viscosity).  The 
result  of  this  simulation  is  reported  in  Fig.  6c.  With  significantly  larger  viscosity  the  nop=5  CG  simulation 
(red  triangles)  completed  and  had  mass  conservation  properties  consistent  with  the  results  presented  in 
panel  (b) .  The  DG  simulation  with  increased  viscosity  (red  circles)  behaved  similarly,  yet  not  identically,  to 
the  lower  viscosity  one  (blue  circles). 

An  alternative  method  of  stabilization  is  to  use  a  filter  which  removes  high  frequency  oscillations  from 

2 

the  solution.  We  designed  an  experiment  where  for  polynomial  order  5  simulations  with  viscosity  /i  =  0.5^- 
we  use  the  Boyd-Vandeven  filter  (see  [27,  16]).  By  design  this  filter  is  not  supposed  to  change  the  mean 
of  the  filtered  field,  trimming  only  the  higher  order  oscillations.  The  result  is  presented  in  Fig.  6d.  The 
filtered  CG  simulation  (red  triangles)  completed  for  lower  viscosity,  while  the  mass  conservation  error  for 


4The  different  behavior  between  CG  and  DG  is  not  unexpected  for  this  case.  The  initial  condition  used  for  both  tests  is 
non-smooth  (not  infinitely  differentiable)  but  the  key  difference  is  that  the  large  amount  of  viscosity  used  in  the  previous  case 
was  sufficient  to  suppress  discontinuities  whereas  in  this  case  (due  to  the  small  amount  of  diffusion)  it  is  not. 
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(a) 


(&) 


Figure  6:  Case  2:  Mass  conservation  error  as  a  function  of  time  for  (a)  uniform  mesh,  (b)  adaptive  mesh,  (c) 
adaptive  mesh  with  increased  viscosity  and  (d)  adaptive  mesh  with  filtering.  Triangular  markers  denote  the  CG 
results  and  circular  markers  represent  the  DG  results.  Colors  represent  different  polynomial  orders  (panel  (a)  and 
(b)),  different  viscosity  parameter  (c)  or  the  use  of  filtering  (d). 


the  DG  method  remained  unchanged  (blue  and  red  circles  overlap).  We  have  to  report,  however,  that  for 
the  original  viscosity  of  /x  =  0.1—  the  CG  nop=5  simulation  did  not  complete  even  with  the  filter  (for 
t  =  5000s).  It  is  not  too  surprising,  perhaps,  since  the  CG  method  has  no  inherent  stabilization  while  our 
DG  uses  a  Rusanov  numerical  flux,  which  is  highly  dissipative  and  thus  more  robust. 

To  further  investigate  the  stability  issue  discovered  in  Fig.  6  we  ran  a  set  of  simulations  with  various 
artificial  viscosity  values  and  different  polynomial  orders  (without  a  filter).  The  simulation  times  of  those 
runs  are  reported  in  Table  I,  where  the  final  time  of  5000s  (in  bold  type)  marks  a  successfully  completed 
simulations,  and  any  smaller  value  (italic)  represents  a  time  at  which  the  simulations  broke  down  5.  All 


5In  our  case  the  simulation  break-down  happened  by  either  the  simulation  producing  an  NaN  value  or  GMRES  exceeding 
the  iteration  count  limit. 


Table  I:  Case  2:  CG  simulation  time  (in  seconds)  for  different  polynomial  orders  and  artificial  viscosity  parame¬ 
ters.  Bold  numbers  symbolize  simulations  that  completed  successfully.  Values  in  italic  mark  the  time  at  which  the 
simulations  broke-down. 


s 

nop=5 

nop=6 

nop=7 

0.1 

716 

897 

1110 

0.5 

956 

994 

5000 

2.0 

1034 

5000 

5000 

3.0 

1578 

5000 

5000 

5.0 

5000 

5000 

5000 

12 


the  results  were  obtained  for  the  CG  method  with  dynamic  AMR  for  Case  2,  for  the  DG  method  all  the 
simulations  completed  successfully.  Note  that  for  the  usual  final  time  of  t  =  700s  all  CG  simulations  would 
be  considered  successful  as  well.  It  is  clear  that  with  increased  artificial  viscosity,  the  simulations  become 
more  stable.  As  a  general  rule  we  can  say  that  for  this  case  all  the  simulations  with  polynomial  order  greater 
than  or  equal  to  6  and  artificial  viscosity  of  at  least  fx  =  2.0m2/s  are  stable  until  t,  =  5000s.  There  seems 
to  be  a  critical  time  after  which  the  simulation  never  breaks  down.  The  times  in  Table  I  indicate  that  this 
critical  time  is  around  t  «  1000  —  1500s,  which  coincides  with  the  bubble  impinging  on  the  top  domain 
boundary.  It  is  possible  that  the  complexity  of  the  flow  impinging  at  the  domain  boundary  is  not  fully 
resolved  by  certain  choices  of  mesh  and  polynomial  order.  High  artificial  diffusion  stabilizes  the  solution 
even  for  under-resolved  flows. 

The  question  that  naturally  arises  is  why  is  DG  more  robust  than  CG  for  AMR  with  low  viscosity  and 
lower  polynomial  order?  We  believe  the  reason  is  due  to  the  additional  dissipation  mechanism  offered  by 
the  Rusanov  flux.  Recall  that  this  flux  can  be  written  as 

F*  =  {F^’V}  -  \\\h[q(e’%  (9) 

where  {  }  denotes  the  mean  value  and  [  ]  is  the  jump  in  standard  DG  notation.  The  jump  term  is  essentially 
an  adaptive  viscosity  that  gets  stronger  with  large  jumps  (the  difference  in  the  solutions  on  each  side  of  the 
edge).  Therefore,  to  make  the  comparison  more  fair  between  CG  and  DG  would  require  the  inclusion  of 
the  additional  adaptive  viscosity  to  the  CG  scheme.  We  reserve  the  study  of  how  best  to  construct  such  a 
dissipative  mechanism  for  future  work. 


4.  Conclusion 

In  this  paper  we  have  investigated  the  mass  conservation  properties  of  a  unified  CG/DG  method  with 
non-conforming  adaptive  meshes  using  two  standard  atmospheric  test  cases.  The  unification  of  the  CG  and 
DG  method  allowed  us  to  compare  directly  the  different  treatment  of  the  non-conforming  interfaces,  because 
every  other  aspect  of  the  code  executes  the  same  routines  for  both  methods. 

The  study  showed  that  the  mass  conservation  error  is  not  significantly  affected  by  the  non-conforming 
mesh  compared  with  the  uniform  mesh.  At  the  levels  of  error  near  machine  precision  we  cannot  be  sure  if  the 
small  difference  we  see  in  the  plots  comes  from  the  presence  of  the  non-conforming  faces,  the  roundoff  error 
in  the  algorithm,  or  the  roundoff  error  committed  while  computing  the  mass.  We  attempted  to  minimize 
the  mass  computation  error  by  using  a  pairwise  sum  algorithm  and  have  included  a  mathematical  proof  on 
the  conservation  of  mass  for  CG  with  non-conforming  meshes. 

The  difference  between  the  CG  and  DG  mass  conservation  error  is  small  and  due  to  a  different  treatment 
of  the  non-conforming  edges.  For  conforming  uniform  meshes  there  is  no  difference  in  mass  conservation 
between  the  CG  and  DG  methods,  unless  the  resolution  is  very  coarse  (order  of  200-300m  in  the  density 
current  case). 

Lower  order  non-conforming  CG  simulations  are  less  robust  than  the  DG  simulations  and  require  more 
stabilization.  Polynomial  order  5  simulations  require  artificial  viscosity  of  fx  =  5.0m2/s  to  complete  5000s 
simulation,  while  higher  polynomial  orders  require  /i  =  2.0m2/s.  We  were  also  successful  with  running 
polynomial  order  7  with  fx  =  0.5m2/s.  None  of  the  DG  simulations  required  any  additional  stabilization, 
even  for  the  very  long  simulation  times  used  in  this  study.  We  showed  that  the  stability  problem  in  the  CG 
non-conforming  method  can  also  be  solved  using  filtering  or  a  combination  of  filtering  and  artificial  viscosity. 
We  are  currently  investigating  other  stabilization  methods  as  well.  It  is  possible  that  adaptive  local  viscosity 
can  stabilize  the  non-conforming  simulations  without  applying  artificial  viscosity  in  the  entire  domain. 

Another  possible  explanation  of  the  stability  issue  is  that  certain  regions  of  the  domain  are  under-resolved 
and  therefore  the  CG  method  is  unstable.  Our  refinement  criterion  is  designed  to  follow  certain  physical 
features  and  is  not  an  error  indicator,  which  would  pick  up  any  growing  instabilities. 

In  the  scope  of  this  paper  only  the  issue  of  mass  conservation  was  discussed.  In  that  comparison  both 
CG  and  DG  non-conforming  AMR  methods  perform  equally  well.  We  will  address  the  issues  of  efficiency 
and  accuracy  in  a  separate  paper. 
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Appendix  A 

In  this  appendix  we  provide  explicitly  the  Q  arrays  for  both  conforming  (Fig.  1)  and  non-conforming 
(Fig.  2)  examples. 

Conforming  Q  matrix: 
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Non-conforming  Q  matrix: 
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Note  that  the  non-Boolean  entires  come  from  the  interpolation  between  parent  and  children  elements. 


5.  Appendix  B  -  Proof  of  Mass  Conservation  by  the  DSS 

Let  us  introduce  the  following  theorem 
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Theorem  1.  Global  mass  is  conserved  by  the  DSS  operator  in  the  CG  discretization  regardless  of  whether 
the  mesh  is  conforming  or  non- conforming. 

The  proof  of  this  theorem  is  accomplished  across  two  subsections.  In  Sec.  5.1  we  define  some  essential 
properties  (Lemmas)  that  are  used  in  the  proof  described  in  Sec.  5.2. 

5.1.  Definitions  and  Properties  of  Matrices 

Let  us  define  Q  £  Rmxn  as  the  scatter  matrix  that  scatters  the  CG  (£  C0)  solution  of  dimension  n  to 
the  DG  solution  of  dimension  m.  Let  Q 1  £  Rnxm  define  gather  matrix  that  gathers  the  DG  C° )  of 
dimension  m,  to  the  CG  solution  (e  C°)  of  dimension  n.  The  dimension  rn  is  defined  as  m  =  Ne(N  +  l)d 
where  Ne  are  the  number  of  elements  in  the  grid,  N  is  the  polynomial  order  in  the  CG  method,  and  d  is  the 
spatial  dimension.  The  dimension  n  is  such  that  n  <  m  and  defines  the  distinct  global  gridpoints.  Examples 
of  simple  Q  matrices  are  given  in  Appendix  A. 

Let  us  define  a  diagonal  matrix  with  integration  weights  and  Jacobians  as  W  £  Rmxm,  which  is  built 
from  the  local  element-wise  mass  matrices  in  a  DG  method.  The  corresponding  CG  mass  matrix  is  defined 
as  follows: 

M  =  QtWQ  £  Rnxn.  (10) 

Let  us  now  review  some  properties  of  the  matrices  Q  and  W.  The  rows  i  and  columns  j  of  the  matrix 
Q  are  in  fact  defined  as  follows:  Q,j  =  Lj(Cj)  where  i  =  1  j  =  l,...,n,  L(£)  are  the  Lagrange 

polynomials  that  we  use  for  interpolation,  and  £  are  the  interpolation  points  in  multiple  dimensions.  To 
make  the  following  exposition  easier  to  follow,  (although  not  necessary  for  the  proof),  one  can  assume  that 
the  multi-dimensional  Lagrange  polynomials  are  derived  from  the  tensor-product  bases  in  one-dimension. 
This  further  simplifies  the  exposition  by  allowing  us  to  assume  that  we  are  only  interested  in  a  single  non- 
conforming  edge  (a  one-dimensional  construct).  If  we  define  the  one-dimensional  Lagrange  polynomials  as 
h(f)  we  can  then  rewrite  =  hj(£i). 

Lagrange  polynomials  enjoy  the  property  of  the  partition  of  unity,  i.e. ,  V£,:  £  [—1,  +1]  Yl'j=i  hj{£i)  =  1- 
Therefore,  all  of  the  rows  (*  =  1,  ...,m)  of  Q  sum  to  1.  This  means  that  =  h:j  (£.;)  has  the  property  that 
all  the  columns  (i  =  1,  ...,m)  sum  to  1.  Let  us  call  this  the  Summation  Property  of  Q1  (see  also  [22]). 

Next,  we  need  to  show  that  the  matrices  Q1  W  and  M  =  Q1  W Q  are  of  full  rank.  The  matrix 
W  £  Rmxm  is  of  full  rank  m  since  the  diagonal  elements  are  the  weights  and  Jacobians  of  the  integration  - 
the  LGL  weights  are  always  positive  and  the  Jacobians  are  positive  for  well-defined  grids  (since  they  represent 
the  volume  of  the  element).  For  the  matrix  QT  £  Rnxm  we  note  that  it  has  n  linearly-independent  rows 
by  construction  (since  it  defines  the  n  distinct  global  gridpoints).  Therefore,  Q  £  Rmxn  has  n  linearly- 
independent  columns  and,  therefore,  the  products  QrW  £  Rnxm  and  M  =  QJWQ  £  Rnxn  will  both 
have  full  rank  n.  Since  M  is  square  and  of  full  rank,  then  it  is  invertible. 

5.2.  Proof  of  Mass  Conservation 

To  prove  that  the  mass  conservation  is  unchanged  by  the  Direct-Stiffness-Summation  (DSS)  operator, 
we  begin  by  assuming  that  we  have  a  solution  vector  (say,  for  density)  defined  as  qDG  which  is  strictly  a 
local  element-wise  vector  such  that  qDG  £  Rm  and  qDG  fL  C°. 

Step  1  We  apply  local  element-wise  integration  which  is  represented  as  the  matrix-vector  operation 

WqDG  £  Rmxl. 

Step  2  We  gather  the  solution  to  all  the  n  distinct  global  gridpoints  via  the  matrix  multiplication 

QTWqDG  £  Rnxl. 

This  is  the  global  assembly  operation  typical  of  continuous  finite  element  methods. 
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Step  3  We  multiply  by  the  inverse  global  mass  matrix  to  bring  the  integral  solution  of  Step  2  to  a  global 
gridpoint  solution  as  follows 

M~lQTWqDG  G  Rnxl. 

The  resulting  vector  lives  in  the  n  distinct  global  gridpoints  and  is  in  C°  (CG  space). 

Step  4  Finally,  we  scatter  the  solution  to  the  m  non-distinct  local  gridpoints  as  follows 

qSG  =  QM~1QTWqDG  G  Rmxl.  (11) 

The  resulting  vector  on  the  left-hand-side  lives  in  the  DG  space  (of  dimension  m)  but  is  in  C°.  The 
original  vector  qDG  need  not  live  in  C°. 

Step  5  Let  us  now  replace  the  mass  matrix  in  Eq.  (11)  using  Eq.  (10)  which  results  in 

q^G  =  q  ^ QTWQ )  1  QTWqDG.  (12) 

Equation  (12)  is  defined  in  [13]  as  the  projection  and  self-adjoint  matrix  (see  also  [22]). 

Step  6  Since  QTW  is  of  full  rank,  then  we  can  left-multiply  Eq.  (12)  by  it  to  obtain 


QTWqg?  =  ( QtWQ )  ( QtWQ )  1  QTWqDG. 


which  simplifies  to 

QT  W  qDG  =  QTWqDG. 


(13) 

(14) 


Step  7  Introducing  the  vector  e„  =  (1, ...,  1)T  G  Rnxl  and  taking  the  scalar  product  of  this  vector  with  Eq. 
(14)  yields 

eTnQTWqG oG  =  eTnQTWqDG  (15) 

where  eflQT  =  ejn,  by  virtue  of  the  Summation  Property  of  QT .  This  simplifies  Eq.  (15)  to 

e™WqS!?  =  z’rnWq00  (16) 

which  states  that  the  global  sum,  weighted  by  W  is  identical  for  both  the  DG  solution  and  the  DG 
solution  after  it  has  been  made  to  be  C°.  The  role  of  W  is  to  perform  the  local  integration  of  qDG, 
while  the  scalar  product  with  e  is  the  global  sum.  Therefore,  the  global  mass  is  conserved. 


17 


